Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for telemetry metrics and tracing #1715

Merged
merged 1 commit into from
May 18, 2024
Merged

Add support for telemetry metrics and tracing #1715

merged 1 commit into from
May 18, 2024

Conversation

jstedfast
Copy link
Owner

No description provided.

@coveralls
Copy link

coveralls commented Mar 1, 2024

Coverage Status

coverage: 94.073% (-0.6%) from 94.692%
when pulling 9a6e1c5 on metrics
into c543f33 on master.

Copy link

@EugeneKrapivin EugeneKrapivin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left you some thoughts, take it or leave it. I didn't see anything that really is a problem! It's AMAZING how you've gone from 0 to hero in metrics gathering.

Once you have a (alpha|beta) nuget version, I'll try to create a Prometheus dashboard using my old POC project.

static TagList GetTags (IPAddress ip, string host, int port, Exception ex = null)
{
var tags = new TagList {
{ "network.peer.address", ip.ToString () },

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should really avoid putting IPs, addresses and ports as tags in metrics. for use cases where you connect to few target servers it would be fine. however, if the target IP/host could change this could potentially cause the backing time-series database (like prometheus or influxDB) to operate poorly due to high cardinality tags, I've actually crashed our influxDB instance this way 🤣.
maybe, if the community wants such things you could add a config switch to add those. on the other hand these values could be probably filtered on export and before ingestion into the timeseries db.
your call :)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I didn't think it'd be that sensitive. But yea, the network.peer.address is probably not all that useful compared to server.address (i.e. hostname) and that's probably good enough.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also kinda wondering if SocketMetrics should just go away since it probably doesn't add a whole lot of value other than the error values.

Perhaps this would be the kind of thing that would be better as log info rather than metrics.

public void RecordClientDisconnected (long startTimestamp, Uri uri, Exception ex = null)
{
if (ConnectionDuration.Enabled) {
var duration = TimeSpan.FromTicks (Stopwatch.GetTimestamp () - startTimestamp).TotalSeconds;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most often than not, recording time better done in milliseconds rather than seconds. It'll also be consistent with your OperationDuration metric units.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got this from HttpClient using seconds for connection_duration and was thinking that clients would realistically be kept connected for long enough durations that ms wouldn't make sense.

For IMAP, this would realistically be the case but perhaps for something like SmtpClient, it isn't necessarily true unless the developer is trying to keep a connection alive for re-use. Not sure how people are using MailKit's SmtpClient in server applications, so this may or may not be the case. System.Net.Mail's SmtpClient was much more like HttpClient in that it kept a connection pool in the background. MailKit, however, is a 1:1 mapping of Client to connection and developers might spawn up a new SmtpClient for every batch send that they do in which case ms would probably be a more useful metric.

Gotta think about the units for this, I guess.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be either, but it should be:

  1. consistent between multiple metrics from the same source, be predictable
  2. should be sensitive enough to show time for short processes and long processes while still being consistent

so either should be fine, just have to pick 1, document it and stick to it :)

{
var tags = new TagList {
{ "url.scheme", uri.Scheme },
{ "server.address", uri.Host },

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though HttpClient reports those tag values, you should also consider the best practices published by microsoft, specifically:

Beware of having very large or unbounded combinations of tag values being recorded in practice...Most metric collection tools will either drop data to stay within technical limits or there can be large monetary costs to cover the data storage and processing...Most metric collection tools will either drop data to stay within technical limits or there can be large monetary costs to cover the data storage and processing.

In some use cases MailKit could be used to relay emails to many servers (think about a multi-tenant scenario).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. I can likely make this configurable.

this.activity = activity;
this.metrics = metrics;

if (activity is not null) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you might be better off with Activity.Current then passing it down into the constructor
https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.activity.current?view=net-6.0

I'm not very fond of dynamic static (too much thread awareness for my personal taste), but it might be fine to avoid passing down the activity.

MailKit/Net/NetworkOperation.cs Outdated Show resolved Hide resolved
/// <remarks>
/// Telemetry constants for MailKit.
/// </remarks>
public static class Telemetry

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest adding another nuget MailKit.Extensions.Hosting (following a common convention https://www.nuget.org/packages?q=extensions.hosting)

the nuget could add multiple extensions methods on top of IServiceCollection like:

  • AddMailKitSmtp(config => {}), AddMailKitPop3(config => {}), AddMailKitImap(config => {})
  • AddMailKitSmtp("name", config => {}), AddMailKitPop3("name", config => {}), AddMailKitImap("name", config => {})
  • AddMailKitMetrics(config => {}) - this will add metrics reporting for mailkit
  • AddMailKitTracing(config => {}) - this will add tracing reporting for mailkit

it might be a bit out of scope, but it would probably be a natural continuation of this project 😄

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I was planning to do something like this. Just wasn't sure what dependency was needed (now I know!)

@EugeneKrapivin
Copy link

Wow AWESOME :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants